*****************************************************************;
* SCRIPTS pseudo_id.zip     2025/26/03 (V2)                     *;
* Olivier Godechot                                              *;
* Email: Olivier.Godechot [at] sciencespo.fr                    *;
*****************************************************************;
* SET OF FILES FOR CREATING PSEUDO IDENTIFIER IN THE DADS FILES *; 
*****************************************************************;

The zip file pseudo_id.zip contains the following SAS and R scripts:

  1. S1_pseudo_id_import_parquet.R (compulsory for running S2 up to 2023)
      This file imports the parquet files and stores them in the parquet_dta folder.
      S2 uses the stored files to create the pseudo_id for 2022 & 2023.

  2. S2_pseudo_id.sas (compulsory)
      It is a program for the CASD platform which creates 
      a common identifier IDENT_ALL in the DADS files thanks to overlapping 
      information between year t-1 of yearfile y and year t of yearfile y-1.
      It makes it possible to chain the stayers and the movers from 2002 to 2023
      (provided the movers reappear in the DADS perimeter in less than one year).
      Before 2001, the movers (who change workplace) cannot be chained, but the
      stayers still can.

  3. S3a_pseudo_id_seniority.sas (optional)
      It calculates the year of entry into the firm and the establishment, which makes it
      possible to calculate seniority. It works up to 2021, for DADS files in SAS mode.

  4. S3b_pseudo_id_sen_2022.R  (optional)
      It calculates the year of entry into the firm and the establishment, which makes it
      possible to calculate seniority. It works for 2022 and subsequent years, generating files
      either in R format (.rds) or Stata (.dta).
      You will need to run S3a_pseudo_id_seniority.sas beforehand.

  5. S3c_pseudo_id_foreign_born.sas  (optional)
      It corrects information on foreign-born and overseas-born status and on citizenship, which is
      obviously incorrect for some years. It works up to 2021, for DADS files in SAS mode.

  6. S3d_pseudo_id_use.sas  (optional)
      It gives an example of creating a DADS file with common identifiers. It optionally adds information
      on seniority and corrected place of birth (for this, you need to run S3a and S3c respectively).
      It works up to 2021, for DADS files in SAS mode.

  7. S3e_pseudo_id_use_2022.R  (optional)
      It gives an example of creating a DADS file with common identifiers.
      It works for 2022 and subsequent years, generating files either in R format (.rds) or Stata (.dta).
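
To illustrate the chaining idea behind S2_pseudo_id.sas, here is a minimal
sketch in Python. It is not the SAS program and makes no claim about the
variables that program actually matches on. It assumes that each record of a
yearfile carries a file-specific identifier (here local_id), a description of
the current year (cur) and a description of the previous year (prev). These
field names, the function name chain_ids, and the contents of the tuples
(establishment, sex, birth year, wage) are all hypothetical.

```python
from collections import Counter
from itertools import count


def chain_ids(yearfiles):
    """Map (yearfile, local_id) to a common identifier.

    yearfiles: dict {year: [record, ...]}; each record is a dict with the
    hypothetical keys 'local_id', 'cur' and 'prev' (tuples or None).
    """
    ident_all = {}
    fresh = count(1)  # source of new common identifiers
    for year in sorted(yearfiles):
        previous = yearfiles.get(year - 1, [])
        # Keep only keys that identify a single record on each side,
        # so that ambiguous matches are never chained.
        cur_counts = Counter(r["cur"] for r in previous)
        prev_counts = Counter(r.get("prev") for r in yearfiles[year])
        lookup = {
            r["cur"]: ident_all[(year - 1, r["local_id"])]
            for r in previous
            if cur_counts[r["cur"]] == 1
        }
        for rec in yearfiles[year]:
            key = rec.get("prev")
            common = None
            if key is not None and prev_counts[key] == 1:
                common = lookup.get(key)
            if common is None:
                common = next(fresh)  # no match: start a new chain
            ident_all[(year, rec["local_id"])] = common
    return ident_all


# Toy data: the worker "b7" of yearfile 2003 changed establishment (E1 to E3),
# but the previous-year information matches record "a1" of yearfile 2002.
files = {
    2002: [
        {"local_id": "a1", "prev": None, "cur": ("E1", "F", 1970, 21000)},
        {"local_id": "a2", "prev": None, "cur": ("E2", "M", 1965, 30500)},
    ],
    2003: [
        {"local_id": "b7", "prev": ("E1", "F", 1970, 21000),
         "cur": ("E3", "F", 1970, 23000)},
        {"local_id": "b9", "prev": None, "cur": ("E2", "M", 1980, 18000)},
    ],
}
ids = chain_ids(files)
```

In this toy example, "a1" and "b7" receive the same common identifier, while
"b9", which has no previous-year information, starts a new chain.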

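Once records share a common identifier, the year of entry and seniority
described for the S3a and S3b scripts can be sketched as follows. This is a
simplified Python illustration, not the logic of those scripts: it assumes
that the year of entry is the first year in which a worker is observed in a
firm, and it ignores interruptions. The record layout (ident_all, year,
firm_id) and the function names are hypothetical.

```python
def entry_years(records):
    """Return {(ident_all, firm_id): first year observed in that firm}."""
    first = {}
    for ident, year, firm in records:
        key = (ident, firm)
        if key not in first or year < first[key]:
            first[key] = year
    return first


def seniority(records):
    """Return {(ident_all, year, firm_id): years since entry in the firm}."""
    first = entry_years(records)
    return {(i, y, f): y - first[(i, f)] for i, y, f in records}


# Toy data: worker 1 stays two years in firm F1, then moves to firm F2.
records = [(1, 2002, "F1"), (1, 2003, "F1"), (1, 2004, "F2")]
sen = seniority(records)
```

The same computation at the establishment level would only replace the firm
identifier with an establishment identifier.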

Don't hesitate to ask questions or to comment.

Possible updates can be found here: http://olivier.godechot.free.fr/hopfichiers/pseudo_id.zip
